Non-manual grammatical marker recognition based on multi-scale, spatio-temporal analysis of head pose and facial expressions

Authors

  • Jingjing Liu
  • Bo Liu
  • Shaoting Zhang
  • Fei Yang
  • Peng Yang
  • Dimitris N. Metaxas
  • Carol Neidle
Abstract

Keywords: American Sign Language (ASL); Non-manual grammatical markers; Eyebrow height; Head gestures; Facial expressions; Conditional Random Field (CRF)

Changes in eyebrow configuration, in conjunction with other facial expressions and head gestures, are used to signal essential grammatical information in signed languages. This paper proposes an automatic recognition system for non-manual grammatical markers in American Sign Language (ASL) based on a multi-scale, spatio-temporal analysis of head pose and facial expressions. The analysis takes account of gestural components of these markers, such as raised or lowered eyebrows and different types of periodic head movements. To advance the state of the art in non-manual grammatical marker recognition, we propose a novel multi-scale learning approach that exploits spatio-temporally low-level and high-level facial features. Low-level features are based on information about facial geometry and appearance, as well as head pose, and are obtained through accurate 3D deformable model-based face tracking. High-level features are based on the identification of gestural events, of varying duration, that constitute the components of linguistic non-manual markers. Specifically, we recognize events such as raised and lowered eyebrows, head nods, and head shakes. We also partition these events into temporal phases. We separate the anticipatory transitional movement (the onset) from the linguistically significant portion of the event, and we further separate the core of the event from the transitional movement that occurs as the articulators return to the neutral position towards the end of the event (the offset). This partitioning is essential for the temporally accurate localization of the grammatical markers, which could not be achieved at this level of precision with previous computer vision methods. In addition, we analyze and use the motion patterns of these non-manual events. Those patterns, together with the information about the type of event and its temporal phases, are defined as the high-level features. Using this multi-scale, spatio-temporal combination of low- and high-level features, we employ learning methods for accurate recognition of non-manual grammatical markers in ASL sentences.

Signed languages are full-fledged natural languages, comparable in structure and complexity to spoken languages but manifested in the visual–gestural modality. Computer-based recognition of sign language from video, which has been the object of various research efforts over the last two decades (e.g., [42,49,47]), is particularly challenging in that it requires attention to detection and interpretation of linguistic information conveyed through both the manual and the non-manual channels. Although …
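The onset / core / offset partitioning described in the abstract can be illustrated with a small sketch. This is not the authors' method, which relies on learned models over tracked 3D face features; it is a simple threshold heuristic on a hypothetical per-frame eyebrow-height signal. The function name `label_phases`, the parameters `event_threshold` and `core_fraction`, and the sample values are all assumptions made for illustration, and the sketch handles only a single raised-eyebrow event per clip.

```python
from typing import List


def label_phases(
    heights: List[float],
    neutral: float,
    event_threshold: float = 0.1,  # assumed: minimum displacement that counts as an event
    core_fraction: float = 0.8,    # assumed: fraction of the peak that counts as the core
) -> List[str]:
    """Label each frame as 'neutral', 'onset', 'core', or 'offset'.

    Illustrative heuristic only: assumes one raised-eyebrow event per clip.
    """
    displacement = [h - neutral for h in heights]
    labels = ["neutral"] * len(heights)

    peak = max(displacement, default=0.0)
    if peak <= event_threshold:
        return labels  # no event detected

    # Core: frames whose displacement is close to the peak.
    core_level = core_fraction * peak
    core_frames = [i for i, d in enumerate(displacement) if d >= core_level]
    first_core, last_core = core_frames[0], core_frames[-1]

    # Onset: walk backwards from the core while the eyebrows are still raised.
    start = first_core
    while start > 0 and displacement[start - 1] > event_threshold:
        start -= 1

    # Offset: walk forwards from the core while the eyebrows are still raised.
    end = last_core
    while end < len(displacement) - 1 and displacement[end + 1] > event_threshold:
        end += 1

    for i in range(start, first_core):
        labels[i] = "onset"
    for i in range(first_core, last_core + 1):
        labels[i] = "core"
    for i in range(last_core + 1, end + 1):
        labels[i] = "offset"
    return labels


# Made-up signal: eyebrows rise, hold, and return to neutral.
signal = [0.0, 0.0, 0.3, 0.7, 1.0, 1.0, 0.9, 0.5, 0.2, 0.0]
phases = label_phases(signal, neutral=0.0)
```

Labels of this kind, together with the event type, correspond to what the abstract calls high-level features. A real system would also need to handle lowered eyebrows, periodic head movements, and several events per sentence.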

Similar resources

3D Face Tracking and Multi-Scale, Spatio-temporal Analysis of Linguistically Significant Facial Expressions and Head Positions in ASL

Essential grammatical information is conveyed in signed languages by clusters of events involving facial expressions and movements of the head and upper body. This poses a significant challenge for computer-based sign language recognition. Here, we present new methods for the recognition of nonmanual grammatical markers in American Sign Language (ASL) based on: (1) new 3D tracking methods for t...

A Framework for the Recognition of Nonmanual Markers in Segmented Sequences of American Sign Language

Despite the fact that there is critical grammatical information expressed through facial expressions and head gestures, most research in the field of sign language recognition has primarily focused on the manual component of signing. We propose a novel framework for robust tracking and analysis of non-manual behaviours, with an application to sign language recognition. The novelty of our method...

Computer-based recognition of facial expressions in ASL: From face tracking to linguistic interpretation

Most research in the field of sign language recognition has focused on the manual component of signing, despite the fact that there is critical grammatical information expressed through facial expressions and head gestures. We, therefore, propose a novel framework for robust tracking and analysis of nonmanual behaviors, with an application to sign language recognition. Our method uses computer ...

Computer-based Tracking, Analysis, and Visualization of Linguistically Significant Nonmanual Events in American Sign Language (ASL)

Our linguistically annotated American Sign Language (ASL) corpora have formed a basis for research to automate detection by computer of essential linguistic information conveyed through facial expressions and head movements. We have tracked head position and facial deformations, and used computational learning to discern specific grammatical markings. Our ability to detect, identify, and tempor...

Machine learning techniques for automated analysis of facial expressions

Automated analysis of facial expressions paves the way for numerous next-generation computing tools including affective computing technologies (proactive and affective user interfaces), learner-adaptive tutoring systems, medical and marketing applications, etc. In this thesis, we propose machine learning algorithms that head toward solving two important but largely understudied problems in autom...

Journal:
  • Image Vision Comput.

Volume 32

Publication date: 2014